Regression with compositional data using the \(\alpha\)-transformation.
alfa.reg(y, x, a, covb = FALSE, xnew = NULL, yb = NULL)
alfa.reg2(y, x, a, xnew = NULL)
For the alfa.reg() function a list including:
The time required by the regression.
The beta coefficients.
The covariance matrix of the beta coefficients, or NULL if it is singular.
The fitted values for xnew if xnew is not NULL.
For the alfa.reg2() function a list with as many sublists as the number of values of \(\alpha\). Each element (sublist) of the list contains the beta coefficients and the fitted values.
A matrix with the compositional data.
A matrix with the continuous predictor variables or a data frame including categorical predictor variables.
The value of the power transformation; it has to be between -1 and 1. If zero values are present, it has to be greater than 0. If \(\alpha=0\), the isometric log-ratio transformation is applied and the solution exists in closed form, since it is the classical multivariate regression. For alfa.reg2() this should be a vector of \(\alpha\) values, and the function repeatedly calls alfa.reg(). For alfa.reg3() it should be a vector with two values, the endpoints of the interval of \(\alpha\). That function uses the optimize function to search for the value of \(\alpha\) that minimizes the sum of squares of the errors. This is an alternative to choosing \(\alpha\) with cv.alfareg, which uses cross-validation.
If this is FALSE, the covariance matrix of the coefficients is not returned. If it is TRUE and the covariance matrix is still not returned, the matrix was singular.
If you have new data use it, otherwise leave it NULL.
If you have already transformed the data using the \(\alpha\)-transformation with the same \(\alpha\) as given in the argument "a", put it here. Otherwise leave it NULL.
This is intended to be used in the function cv.alfareg in order to speed up the process. The time difference in that function is small for small samples.
With a few thousand observations and/or a few more components, however, the difference becomes larger.
Michail Tsagris.
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.
The \(\alpha\)-transformation is applied to the compositional data first and then multivariate regression is applied. This involves numerical optimisation. The alfa.reg2() function accepts a vector with many values of \(\alpha\), while the alfa.reg3() function searches for the value of \(\alpha\) that minimizes the Kullback-Leibler divergence between the observed and the fitted compositional values. The functions are highly optimized.
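The first step described above, the \(\alpha\)-transformation itself, can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's optimized code: the helper names helmert_sub and alfa_sketch are hypothetical, the Helmert sub-matrix is built from contr.helmert, and the formula follows the definition in Tsagris, Preston and Wood (2011).

```r
# Base-R sketch of the alpha-transformation for one composition x
# (strictly positive parts summing to 1). The helper names are
# hypothetical and are not functions from the package.
helmert_sub <- function(D) {
  H <- t(contr.helmert(D))     # (D - 1) x D, each row sums to zero
  H / sqrt(rowSums(H^2))       # scale each row to unit length
}

alfa_sketch <- function(x, a) {
  D <- length(x)
  H <- helmert_sub(D)
  if (a == 0) {
    # limiting case: isometric log-ratio transformation
    z <- log(x) - mean(log(x))         # centred log-ratio
  } else {
    u <- x^a / sum(x^a)                # power-transformed composition
    z <- (D * u - 1) / a
  }
  drop(H %*% z)                        # vector of length D - 1
}

x <- c(0.2, 0.3, 0.5)
alfa_sketch(x, 0.2)
alfa_sketch(x, 0)
```

As a approaches 0, the output of the general branch approaches the a = 0 branch, so the log-ratio case is the limit of the power transformation rather than a separate method.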
Tsagris M. (2015). Regression analysis with compositional data containing zero values. Chilean Journal of Statistics, 6(2): 47-57. https://arxiv.org/pdf/1508.01913v1.pdf
Tsagris M.T., Preston S. and Wood A.T.A. (2011). A data-based power transformation for compositional data. In Proceedings of the 4th Compositional Data Analysis Workshop, Girona, Spain. https://arxiv.org/pdf/1106.1451.pdf
Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate analysis. Academic Press.
Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.
cv.alfareg, alfa.slx
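library(Compositional)  # package providing alfa.reg() and the fadn data
data(fadn)
y <- fadn[, 3:7]  # compositional response
x <- fadn[, 8]    # predictor
mod <- alfa.reg(y, x, 0.2)  # fit with alpha = 0.2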
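As a companion to the example above, the closed-form case \(\alpha=0\) can be reproduced on simulated data with base R alone: apply the isometric log-ratio transformation, fit a classical multivariate regression with lm, and map the fitted values back to the simplex. This is only a sketch under assumptions. The simulated data, the closure helper and the use of contr.helmert for the Helmert sub-matrix are illustrative choices, and the code is not the package's implementation.

```r
set.seed(1)
n <- 50
D <- 3
x <- rnorm(n)                                  # simulated predictor

# Helmert sub-matrix, (D - 1) x D, with orthonormal rows
H <- t(contr.helmert(D))
H <- H / sqrt(rowSums(H^2))

closure <- function(m) m / rowSums(m)          # rescale rows to sum to 1

# Simulate compositions: linear signal plus noise in log-ratio space
z_true <- cbind(0.5 + x, -0.3 * x) +
  matrix(rnorm(n * (D - 1), sd = 0.1), n)
y <- closure(exp(z_true %*% H))                # n x D compositions

# Isometric log-ratio transformation of the response
ly <- log(y)
z <- (ly - rowMeans(ly)) %*% t(H)              # n x (D - 1)

# Classical multivariate regression (closed form, no optimisation)
fit <- lm(z ~ x)
be <- coef(fit)                                # 2 x (D - 1) coefficients

# Fitted values mapped back to the simplex
est <- closure(exp(fitted(fit) %*% H))
head(est)
```

For nonzero \(\alpha\) no such closed form is available, which is where the numerical optimisation mentioned in the details comes in.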